First dose COVID-19 vaccine coverage amongst adolescents and children in England: an analysis of 3.21 million patients' primary care records in situ using OpenSAFELY

Background: The coronavirus disease 2019 (COVID-19) vaccination programme in England was extended to include all adolescents and children by April 2022. The aim of this paper is to describe trends and variation in vaccine coverage in different clinical and demographic groups amongst adolescents and children in England by August 2022. Methods: With the approval of NHS England, a cohort study was conducted of 3.21 million children and adolescents' records in general practice in England, in situ and within the infrastructure of the electronic health record software vendor TPP using OpenSAFELY. Vaccine coverage across various demographic (sex, deprivation index and ethnicity) and clinical (risk status) populations is described. Results: Coverage is higher amongst adolescents than amongst children, with 53.5% of adolescents and 10.8% of children having received their first dose of the COVID-19 vaccine. Within those groups, coverage varies by ethnicity, deprivation index and risk status; there is no evidence of variation by sex. Conclusion: First dose COVID-19 vaccine coverage is shown to vary amongst various demographic and clinical groups of children and adolescents.


Introduction
By April 2022, the coronavirus disease 2019 (COVID-19) vaccination programme in England had been expanded to include all adolescents (12-15 year olds) and children (5-11 year olds). Invitations were phased in over several stages: they were first extended to those considered clinically vulnerable or living with a vulnerable adult (adolescents in August 2021 and children in January 2022), with all others invited soon after (adolescents in September 2021 and children in April 2022). We have extended our existing COVID-19 vaccine analysis pipeline 1, implemented using the OpenSAFELY platform, to include adolescents and children 2. The aim of this paper is to describe trends and variation in vaccine coverage in different clinical and demographic groups amongst adolescents and children in England. Our analysis queries the primary care data of 3.21m adolescents and children, a subset (~40%) of the entire population of adolescents and children in England; specifically, this subset is all 5-15 year olds registered with a GP practice using TPP SystmOne electronic health record (EHR) software.

Study design
A retrospective cohort study was conducted using general practice (GP) primary care electronic health record (EHR) data from all GP practices in England that use software supplied by the EHR vendor TPP. The national vaccination campaign began on 8th December 2020; we therefore captured vaccination data from 7th December 2020 to 10th August 2022.

Data access and verification
Access to the underlying identifiable and potentially re-identifiable pseudonymised electronic health record data is tightly governed by various legislative and regulatory frameworks, and restricted by best practice. The data in OpenSAFELY is drawn from General Practice data across England where TPP is the Data Processor. TPP developers (CB, JC, JP, FH, and SH) initiate an automated process to create pseudonymised records in the core OpenSAFELY database, which are copies of key structured data tables in the identifiable records. These are linked onto key external data resources that have also been pseudonymised via SHA-512 one-way hashing of NHS numbers using a shared salt. Bennett Institute for Applied Data Science developers and PIs (CEM, SCB, AJW, WJH, HJC, PI) holding contracts with NHS England have access to the OpenSAFELY pseudonymised data tables as needed to develop the OpenSAFELY tools. These tools in turn enable researchers with OpenSAFELY Data Access Agreements to write and execute code for data management and data analysis without direct access to the underlying raw pseudonymised patient data, and to review the outputs of this code.
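As an illustration of the pseudonymisation step described above, the following is a minimal Python sketch of salted one-way hashing with SHA-512. The text does not describe how the salt and NHS number are combined or encoded, so the concatenation order, the `pseudonymise` function name, the example salt and the placeholder NHS number are all assumptions made for illustration; this is not the OpenSAFELY or TPP implementation.

```python
import hashlib


def pseudonymise(nhs_number: str, salt: str) -> str:
    """Return the hex SHA-512 digest of a salted NHS number.

    Assumption: the salt is simply prepended to the identifier and the
    result is UTF-8 encoded; the real scheme is not given in the text.
    """
    return hashlib.sha512((salt + nhs_number).encode("utf-8")).hexdigest()


# Hypothetical values for illustration only (not real identifiers or salts)
SHARED_SALT = "example-shared-salt"
PLACEHOLDER_NHS_NUMBER = "0000000000"

digest_a = pseudonymise(PLACEHOLDER_NHS_NUMBER, SHARED_SALT)
digest_b = pseudonymise(PLACEHOLDER_NHS_NUMBER, SHARED_SALT)
digest_other_salt = pseudonymise(PLACEHOLDER_NHS_NUMBER, "a-different-salt")

# Same input and salt give the same digest, so hashed values can be matched
print(digest_a == digest_b)
# A different salt gives an unrelated digest
print(digest_a == digest_other_salt)
```

Because the same salt yields the same digest for the same NHS number, datasets hashed separately can in principle be linked on the hashed value without exchanging the identifier itself.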

Study population
All patients alive and registered with a general practice using TPP in England at the study end date, 10 August 2022, and aged between 5 and 15 on that same date were included in this study. This gives the most up-to-date view of the vaccination landscape amongst the eligible cohort to assist with future planning: patients can only be followed up for vaccination if they are alive and registered. Patients without a recorded sex were excluded.
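The inclusion rules above can be sketched as a simple filter. This is a hedged illustration only: the `Patient` record layout, the field names, and the availability of an exact date of birth are assumptions, and the actual study definition lives in the linked code repository.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

STUDY_END = date(2022, 8, 10)  # study end date given in the text


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only
    date_of_birth: date
    sex: Optional[str]       # None means no recorded sex
    alive_at_end: bool       # alive on STUDY_END
    registered_at_end: bool  # registered with a TPP practice on STUDY_END


def age_on(dob: date, on: date) -> int:
    """Age in completed years on the given date."""
    had_birthday = (on.month, on.day) >= (dob.month, dob.day)
    return on.year - dob.year - (0 if had_birthday else 1)


def age_group(p: Patient) -> Optional[str]:
    """Return 'children' (5-11), 'adolescents' (12-15), or None if out of range."""
    age = age_on(p.date_of_birth, STUDY_END)
    if 5 <= age <= 11:
        return "children"
    if 12 <= age <= 15:
        return "adolescents"
    return None


def is_included(p: Patient) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    return (
        p.alive_at_end
        and p.registered_at_end
        and p.sex is not None
        and age_group(p) is not None
    )
```

Age is evaluated on the study end date rather than at vaccination, mirroring the text's choice to describe the currently eligible cohort.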

COVID-19 vaccine status
Vaccination information is transmitted back to patients' primary care records in the days following vaccine administration in a designated centre. We ascertained which patients had any recorded COVID-19 vaccine administration code in their primary care record (Pfizer-BioNTech mRNA vaccine, AstraZeneca-Oxford vaccine or Moderna vaccine). Vaccinations recorded in the most recent comparable OpenSAFELY-TPP database build were included for those vaccinated up to 10 August 2022. All counts are rounded to the nearest 7.
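The rounding of counts to the nearest 7 can be sketched as follows. The text does not say how the rounding is implemented, so the `round_to_nearest` helper below is an assumed, illustrative version rather than the code used in the analysis.

```python
def round_to_nearest(count: int, base: int = 7) -> int:
    """Round a non-negative integer count to the nearest multiple of `base`."""
    # Integer arithmetic avoids floating-point issues; with an odd base such
    # as 7, an integer can never sit exactly halfway between two multiples.
    return ((count + base // 2) // base) * base


# Illustrative raw counts (hypothetical, not taken from the study data)
for raw in (3, 4, 10, 11, 51_375):
    print(raw, "->", round_to_nearest(raw))
```

For instance, a raw count of 10 becomes 7 and a raw count of 11 becomes 14, so any reported figure may differ from the underlying count by at most 3.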

Key demographic and clinical characteristics of vaccinated groups
Patient demographics defined by the national reporting specification (sex, and ethnicity in six broad categories and 16 detailed categories) were extracted. Deprivation was also measured, by the Index of Multiple Deprivation (IMD, in quintiles), derived from the patient's postcode at Lower Super Output Area. Patients with missing data were grouped into an unknown category.

Risk status
Patients 'in a risk group' were identified using the criteria in Table 4 of The Green Book and codelists from the SARS-CoV-2 (COVID-19) Vaccine Uptake Reporting Specification Collection 2020/2021 (v1.5.3), as distributed by PRIMIS. This includes patients with: immunosuppression; chronic kidney disease; chronic liver disease; chronic heart disease; chronic respiratory disease; chronic neurological disease (including stroke/TIA, cerebral palsy, or MS); asplenia or dysfunction of the spleen; asthma; diabetes; severe mental illness; learning disabilities; and pregnancy. We did not include those living with a vulnerable person in this group, due to a lack of up-to-date household information.

Software availability
All code for the full data management pipeline (from raw data to completed results for this analysis) and for the OpenSAFELY platform as a whole is available for review at GitHub and archived in Zenodo 3.

Amendments from Version 1
In this version, in response to reviewer comments we have made some small changes to the text: added more detail to the Methods (full code available in linked repository); described how the detailed ethnicity groups map to the broad groupings; added more details and limitations to the Discussion.
Data management and analysis was performed using the OpenSAFELY software libraries and Python, both implemented using Python 3, with additional analyses carried out using R. Code for data management and analysis as well as codelists is archived online (https://github.com/opensafely/nhs-covid-vaccination-coverage/tree/1.46.1).

Results
There is some evidence that coverage is higher amongst children identified as at higher risk of severe COVID-19 (and therefore invited for their first vaccination at an earlier date). First dose uptake is 57.2% (51,373 of 89,859) amongst adolescents "in a risk group", compared to 53.2% (581,749 of 1,094,072) amongst those "not in a risk group"; similarly, 18.0% (21,350 of 118,839) of children "in a risk group" have received their first dose, compared to 10.4% (197,813 of 1,903,328) of children "not in a risk group".
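The percentages quoted above follow directly from the reported counts. The short Python sketch below recomputes them; the dictionary layout and the `coverage` and `overall` helper names are illustrative assumptions, and the counts are those given in the text (already rounded to the nearest 7).

```python
# (vaccinated, total) counts as quoted in the text
COUNTS = {
    ("adolescents", "in a risk group"): (51_373, 89_859),
    ("adolescents", "not in a risk group"): (581_749, 1_094_072),
    ("children", "in a risk group"): (21_350, 118_839),
    ("children", "not in a risk group"): (197_813, 1_903_328),
}


def coverage(vaccinated: int, total: int) -> float:
    """First dose coverage as a percentage, to one decimal place."""
    return round(100 * vaccinated / total, 1)


def overall(age_group: str) -> float:
    """Coverage for an age group, pooling its two risk strata."""
    vaccinated = sum(v for (a, _), (v, _t) in COUNTS.items() if a == age_group)
    total = sum(t for (a, _), (_v, t) in COUNTS.items() if a == age_group)
    return coverage(vaccinated, total)


by_group = {key: coverage(*pair) for key, pair in COUNTS.items()}

for (age, risk), pct in by_group.items():
    print(f"{age}, {risk}: {pct}%")
print("adolescents overall:", overall("adolescents"))
print("children overall:", overall("children"))
```

Pooling the two risk strata within each age group gives 53.5% for adolescents and 10.8% for children, consistent with the headline figures in the abstract.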

Discussion
Overall, first dose COVID-19 vaccine coverage is lower amongst 5-15 year olds than it is amongst the adult (over 16) population. Coverage amongst children is particularly low: a higher percentage of adolescents have received their first dose in all demographic and clinical subgroups. In both age groups, coverage is shown to vary amongst all but one of the demographic and clinical groups examined (sex being the only breakdown that does not exhibit a difference). Demographic disparities previously observed amongst the adult population are also observed for 5-15 year olds. There is some evidence that those identified as being "in a risk group" are more likely to be vaccinated; this is more apparent in the 5-11 age group than the 12-15 age group. The coverage in the different age and risk groups is likely to be affected by the vaccine campaign length at the time of analysis, particularly for children (only 18 weeks for children not classed as "at risk").
Factors contributing to demographic differences could be further explored in a future study, for example the proportion of those in each ethnic/deprivation group who are "at risk" or living with a vulnerable person. Our results cover ~40% of England and may not be fully generalisable to the whole country; for example, London is underrepresented. However, we previously showed similar vaccination patterns in adults in this population versus the combined (TPP and EMIS) population (~95% of England) 1,4 and that this population is broadly representative of England as a whole 5.
For context, our cumulative coverage figures demonstrate that coverage is continuing to increase over time, particularly amongst children. However, in adolescents and children there is not yet any evidence of substantial improvement in the disparities shown here. We encourage readers to view the full report at reports.opensafely.org for more details on the progress of the vaccination programme over time, to inform vaccination campaigns locally and address any inequalities in vaccination coverage.

Information governance and ethical approval
NHS England is the data controller for OpenSAFELY-TPP; TPP is the data processor; all study authors using OpenSAFELY have the approval of NHS England. This implementation of OpenSAFELY is hosted within the TPP environment which is accredited to the ISO 27001 information security standard and is NHS IG Toolkit compliant. Patient data has been pseudonymised for analysis and linkage using industry standard cryptographic hashing techniques; all pseudonymised datasets transmitted for linkage onto OpenSAFELY are encrypted; access to the platform is via a virtual private network (VPN) connection, restricted to a small group of researchers; the researchers hold contracts with NHS England and only access the platform to initiate database queries and statistical models; all database activity is logged; only aggregate statistical outputs leave the platform environment following best practice for anonymisation of results such as statistical disclosure control for low cell counts.
The OpenSAFELY research platform adheres to the obligations of the UK General Data Protection Regulation (GDPR) and the Data Protection Act 2018. In March 2020, the Secretary of State for Health and Social Care used powers under the UK Health Service (Control of Patient Information) Regulations 2002 (COPI) to require organisations to process confidential patient information for the purposes of protecting public health, providing healthcare services to the public and monitoring and managing the COVID-19 outbreak and incidents of exposure; this sets aside the requirement for patient consent. This was extended in November 2022 for the NHS England OpenSAFELY COVID-19 research platform. In some cases of data sharing, the common law duty of confidence is met using, for example, patient consent or support from the Health Research Authority Confidentiality Advisory Group.
Taken together, these provide the legal bases to link patient datasets on the OpenSAFELY platform. GP practices, from which the primary care data are obtained, are required to share relevant health information to support the public health response to the pandemic, and have been informed of the OpenSAFELY analytics platform.
This study was approved by the Health Research Authority (REC reference 20/LO/0651) and by the LSHTM Ethics Board (reference 21863).

Data availability
All data were linked, stored and analysed securely within the OpenSAFELY platform https://opensafely.org/. Data include pseudonymised data such as coded diagnoses, medications and physiological parameters. No free text data are included. All code is shared openly for review and re-use under the MIT open license (https://github.com/opensafely/nhs-covid-vaccination-coverage/tree/1.46.1). Detailed pseudonymised patient data is potentially re-identifiable and therefore not shared. We rapidly delivered the OpenSAFELY data analysis platform without prior funding to deliver timely analyses on urgent research questions in the context of the global COVID-19 health emergency: now that the platform is established we are developing a formal process for external users to request access in collaboration with NHS England; details of this process are available at OpenSAFELY.org/onboarding-new-users.

Open Peer Review
Is the 'British or Mixed British' group from the broad White group? It would be helpful to detail which of the broad groups the detailed categories belong to somewhere in the text or figure.
Information governance and ethical approval. "NHS IG Toolkit compliant" is duplicated.

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Partly

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 06 Jun 2023

Helen Curtis
--Thank you for taking the time to review our work and for the positive comments. In version 2 of the paper we have addressed your comments as follows:
The Discussion needs to highlight more that the uptake in the different age and risk groups is likely to be affected by the vaccine campaign length, rather than this being a footnote in the figure, and referenced in the external link. It will be interesting to see whether these variations persist in children when the vaccine had been available for a longer period, or whether the differences narrow.
--We have expanded on this in the Discussion as suggested.
I have a few more specific comments below.
Abstract -Results. Consider adding "by August 2022" at the end of the first sentence.
--We have added this as suggested.
Study design. Instead of defining the cohort study beginning and ending, it might be clearer to describe "the vaccine data were available between 7th December…".
--We have rephrased this.
Key demographic and clinical characteristics of vaccinated groups. The description of the IMD quintiles implies that a higher quintile score relates to greater deprivation, while the figures show that quintile 5 is the least deprived quintile.
--We have removed the confusing mention of this from the methods. --We have now described this in the figure legend.
Information governance and ethical approval. "NHS IG Toolkit compliant" is duplicated.
--Thank you, we have removed the repeated phrase.
this work that could be clarified, which I will highlight below: The introduction to this research note is very brief, but additional context would be useful. For example, information on the status of government restrictions/social distancing measures in relation to the timing of offering vaccines to children and adolescents might be helpful. The aim of the paper was included in the "background" section of the abstract, but was not repeated in the introduction and should also be included here.
The methods were briefly outlined, but again, more detail would be helpful. I appreciate this is a research note and therefore will be less detailed than a full research article, but it appears to currently be well within the word limits so key areas could be expanded. For example, it appears the initial vaccine cohort using the OpenSAFELY platform included individuals belonging to GP practices using both TPP and EMIS, but this subset is only TPP -are there likely to be any expected differences in the study population that may affect generalisability?
I would appreciate it if the authors could further explain/justify why the study population included everyone alive/registered with a GP practice at the end date of the study period.
In the methods section it would be helpful to see a full list of the key demographic and clinical characteristics included in this study. It would also be useful to return to this list in the discussion to discuss any that were not included that may be associated with vaccination of child/adolescents (e.g. parent/carer vaccination status?). In reading the methods, I wasn't entirely clear as to whether data on household risk status was available. Were the authors able to identify children/adolescents living with a vulnerable adult, and therefore were these children/adolescents included in the "in a risk group" category? It would be helpful to clarify this point and if these children/adolescents were not included in the "in a risk group", the implications of this could be discussed further in the discussion section.
The results present descriptive data in a series of figures. However as mentioned in the footnote to the figure, vaccine eligibility for each group (at risk vs not, children vs adolescents) varied over time. I wondered whether the relationship between these groups and demographic factors varied? (e.g. vaccine uptake for children aged 5-11 and "not at risk" was lower, but this may be due to less time eligible -would the differences observed in ethnicity/IMD based on vaccination status be greater for this group? Or to ask a separate question, does the proportion of children/adolescents "at risk" vary among different ethnic groups?) I would appreciate justification in the methods why these groups were not analysed separately. Additionally, as proportions are being compared in this research note, it would be useful to include the relevant statistics to support the statements made in the text.
In the discussion, some of the main findings of this work are clearly summarised, but what are the implications of this work? The authors helpfully point out that vaccine coverage is increasing over time, but readers are encouraged to view the full report. Which details of the full report would be useful for readers to understand? What are the conclusions/lessons to be learned from this research?
Many thanks for the opportunity to review this interesting report and I hope these comments are helpful to the authors.

Is the study design appropriate and is the work technically sound? Partly
Are sufficient details of methods and analysis provided to allow replication by others? Partly

If applicable, is the statistical analysis and its interpretation appropriate? No
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Epidemiology, health services research
I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 09 Jun 2023

Helen Curtis
Thank you for taking the time to review our work and for the largely positive review and constructive suggestions. We have now submitted a revised version of the article; please see our responses to specific points below.
"The introduction to this research note is very brief, but additional context would be useful. For example, information on the status of government restrictions/social distancing measures in relation to the timing of offering vaccines to children and adolescents might be helpful. The aim of the paper was included in the "background" section of the abstract, but was not repeated in the introduction and should also be included here." --We have added the aim to the Introduction but we think that otherwise sufficient detail has been supplied for a research note.
"The methods were briefly outlined, but again, more detail would be helpful. I appreciate this is a research note and therefore will be less detailed than a full research article, but it appears to currently be well within the word limits so key areas could be expanded. For example, it appears the initial vaccine cohort using the OpenSAFELY platform included individuals belonging to GP practices using both TPP and EMIS, but this subset is only TPP -are there likely to be any expected differences in the study population that may affect generalisability?" --We have added a note on TPP coverage to the Discussion and in the Methods we have expanded on the inclusion criteria and demographic groups section as suggested below. All code is openly shared for inspection and re-use by any interested reader in more detailed information on the methods.
"I would appreciate it if the authors could further explain/justify why the study population included everyone alive/registered with a GP practice at the end date of the study period." --We have expanded on this as suggested.
"In the methods section it would be helpful to see a full list of the key demographic and clinical characteristics included in this study. It would also be useful to return to this list in the discussion to discuss any that were not included that may be associated with vaccination of child/adolescents (e.g. parent/carer vaccination status?). In reading the methods, I wasn't entirely clear as to whether data on household risk status was available.
Were the authors able to identify children/adolescents living with a vulnerable adult, and therefore were these children/adolescents included in the "in a risk group" category? It would be helpful to clarify this point and if these children/adolescents were not included in the "in a risk group", the implications of this could be discussed further in the discussion section." --We have expanded on this in the Methods and Discussion as suggested.
"The results present descriptive data in a series of figures. However as mentioned in the footnote to the figure, vaccine eligibility for each group (at risk vs not, children vs adolescents) varied over time. I wondered whether the relationship between these groups and demographic factors varied? (e.g. vaccine uptake for children aged 5-11 and "not at risk" was lower, but this may be due to less time eligible -would the differences observed in ethnicity/IMD based on vaccination status be greater for this group? Or to ask a separate question, does the proportion of children/adolescents "at risk" vary among different ethnic groups?) I would appreciate justification in the methods why these groups were not analysed separately. " --We have now expanded on this in the Discussion.
"Additionally, as proportions are being compared in this research note, it would be useful to include the relevant statistics to support the statements made in the text. " --We have not implemented statistical tests in this short research note as the results presented are sufficient to meet the aims of exploring trends and variation, and are consistent with our previous papers on adult vaccine coverage and with the data presented in our linked full report.
"In the discussion, some of the main findings of this work are clearly summarised, but what are the implications of this work? The authors helpfully point out that vaccine coverage is increasing over time, but readers are encouraged to view the full report. Which details of the full report would be useful for readers to understand? What are the conclusions/lessons to be learned from this research?" --We have expanded this part of the Discussion as suggested.
Competing Interests: No competing interests were disclosed.